Multimodal interaction
Multimodal interaction provides the user with multiple modes of interacting with a system. A multimodal interface provides several distinct tools for the input and output of data. For example, a multimodal question answering system employs multiple modalities (such as text and photo) at both the question (input) and answer (output) level.〔Mittal et al. (2011). “Versatile question answering systems: seeing in synthesis”. Int. J. of Intelligent Information Database Systems, 5(2), pp. 119-142.〕

== Introduction ==

Multimodal human-computer interaction refers to the “interaction with the virtual and physical environment through natural modes of communication”,〔Bourguet, M.L. (2003). “Designing and Prototyping Multimodal Commands”. Proceedings of Human-Computer Interaction (INTERACT'03), pp. 717-720.〕 i.e. the modes involving the five human senses.〔Ferri, F., & Paolozzi, S. (2009). “Analyzing Multimodal Interaction”. In P. Grifoni (Ed.), Multimodal Human Computer Interaction and Pervasive Services (pp. 19-33). Hershey, PA: Information Science Reference. doi:10.4018/978-1-60566-386-9.ch002〕 This implies that multimodal interaction enables freer and more natural communication, interfacing users with automated systems in both input and output.〔Stivers, T., Sidnell, J. (2005). “Introduction: Multimodal interaction”. Semiotica, 156(1/4), pp. 1-20.〕 Specifically, multimodal systems can offer a flexible, efficient and usable environment in which users interact through input modalities such as speech, handwriting, hand gesture and gaze, and receive information from the system through suitably combined output modalities such as speech synthesis and smart graphics.

A multimodal system therefore has to recognize the inputs from the different modalities, combining them according to temporal and contextual constraints〔Caschera, M.C., Ferri, F., Grifoni, P. (2007). “Multimodal interaction systems: information and time features”. International Journal of Web and Grid Services (IJWGS), Vol. 3, Issue 1, pp. 82-99.〕 in order to allow their interpretation. This process is known as multimodal fusion, and it has been the object of several research works from the 1990s to the present.〔D'Ulizia, A., Ferri, F. and Grifoni, P. (2010). “Generating Multimodal Grammars for Multimodal Dialogue Processing”. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, Vol. 40, No. 6, pp. 1130-1145.〕〔D'Ulizia, A. (2009). “Exploring Multimodal Input Fusion Strategies”. In: Grifoni, P. (Ed.), Handbook of Research on Multimodal Human Computer Interaction and Pervasive Services: Evolutionary Techniques for Improving Accessibility. IGI Publishing, pp. 34-57.〕〔Sun, Y., Shi, Y., Chen, F. and Chung, V. (2007). “An Efficient Multimodal Language Processor for Parallel Input Strings in Multimodal Input Fusion”. In Proc. of the International Conference on Semantic Computing, pp. 389-396.〕〔Russ, G., Sallans, B., Hareter, H. (2005). “Semantic Based Information Fusion in a Multimodal Interface”. International Conference on Human-Computer Interaction (HCI'05), Las Vegas, Nevada, USA, 20-23 June, pp. 94-100.〕〔Corradini, A., Mehta, M., Bernsen, N.O., Martin, J.-C. (2003). “Multimodal Input Fusion in Human-Computer Interaction on the Example of the on-going NICE Project”. In Proceedings of the NATO-ASI Conference on Data Fusion for Situation Monitoring, Incident Detection, Alert and Response Management, Yerevan, Armenia.〕〔Pavlovic, V.I., Berry, G.A., Huang, T.S. (1997). “Integration of audio/visual information for use in human-computer intelligent interaction”. Proceedings of the 1997 International Conference on Image Processing (ICIP '97), Volume 1, pp. 121-124.〕〔Andre, M., Popescu, V.G., Shaikh, A., Medl, A., Marsic, I., Kulikowski, C., Flanagan, J.L. (1998). “Integration of Speech and Gesture for Multimodal Human-Computer Interaction”. In Second International Conference on Cooperative Multimodal Communication, 28-30 January, Tilburg, The Netherlands.〕〔Vo, M.T., Wood, C. (1996). “Building an application framework for speech and pen input integration in multimodal learning interfaces”. In Proceedings of the Acoustics, Speech, and Signal Processing (ICASSP'96), May 7-10, IEEE Computer Society, Volume 06, pp. 3545-3548.〕
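To make the temporal side of fusion concrete, the following is a minimal sketch of time-window-based grouping of recognized input events into candidate multimodal commands. The event names, the session data and the 1.5-second window are illustrative assumptions, not taken from any system cited above; the fusion engines in the literature additionally apply contextual constraints (for example, grammars over the fused symbols) on top of such temporal grouping.

<syntaxhighlight lang="python">
# A minimal sketch of time-window-based multimodal fusion.
# All names and the 1.5-second window are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class InputEvent:
    modality: str      # e.g. "speech", "gesture", "gaze"
    content: str       # recognized symbol: a word, or a pointed-at object
    timestamp: float   # seconds since the start of the interaction

FUSION_WINDOW = 1.5    # assumed temporal constraint (seconds)

def fuse(events):
    """Group events whose timestamps fall within the fusion window of each
    other, yielding candidate multimodal commands for later interpretation."""
    events = sorted(events, key=lambda e: e.timestamp)
    groups, current = [], []
    for event in events:
        if current and event.timestamp - current[-1].timestamp > FUSION_WINDOW:
            groups.append(current)   # gap too large: close the current group
            current = []
        current.append(event)
    if current:
        groups.append(current)
    return groups

# The classic "put that there" interaction: deictic words are resolved
# against pointing gestures that arrive close to them in time.
session = [
    InputEvent("speech", "put", 0.10),
    InputEvent("speech", "that", 0.45),
    InputEvent("gesture", "point:vase", 0.50),
    InputEvent("speech", "there", 1.20),
    InputEvent("gesture", "point:table", 1.30),
    InputEvent("speech", "delete", 5.00),   # a later, separate command
    InputEvent("gesture", "point:photo", 5.10),
]
for group in fuse(session):
    print([(e.modality, e.content) for e in group])
</syntaxhighlight>

Here the first five events fuse into one candidate command and the two later events into another, purely on temporal grounds; interpreting each group is the next stage.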
The fused inputs are interpreted by the system. Naturalness and flexibility can produce more than one interpretation for each modality (channel), as well as for their simultaneous use, and can consequently produce multimodal ambiguity,〔Caschera, M.C., Ferri, F., Grifoni, P. (2013). “From Modal to Multimodal Ambiguities: a Classification Approach”. Journal of Next Generation Information Technology (JNIT), Vol. 4, No. 5, pp. 87-109.〕 generally due to imprecision, noise or other similar factors. Several methods have been proposed for resolving such ambiguities.〔Caschera, M.C., Ferri, F., Grifoni, P. (2013). “InteSe: An Integrated Model for Resolving Ambiguities in Multimodal Sentences”. IEEE Transactions on Systems, Man, and Cybernetics: Systems, Volume 43, Issue 4, pp. 911-931.〕
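As an illustration of one common resolution strategy (not the InteSe model or any other specific method cited above), the sketch below lets each unimodal recognizer return an n-best list with confidence scores, then ranks the joint interpretations by combined score while keeping only mutually consistent pairs. The hypothesis lists and the compatibility table are invented for the example.

<syntaxhighlight lang="python">
# A minimal sketch of confidence-based multimodal disambiguation.
# All hypotheses, scores and the compatibility table are invented.
from itertools import product

# Hypothetical n-best recognizer outputs: (interpretation, confidence).
speech_hypotheses  = [("open file", 0.6), ("open mail", 0.4)]
gesture_hypotheses = [("point:document", 0.7), ("point:envelope", 0.3)]

# Assumed contextual constraint: which speech/gesture readings fit together.
consistent = {
    ("open file", "point:document"),
    ("open mail", "point:envelope"),
}

def rank_interpretations(speech, gesture):
    """Score every consistent speech/gesture pairing by the product of the
    unimodal confidences and return the candidates best-first."""
    candidates = [
        (s_text, g_text, s_conf * g_conf)
        for (s_text, s_conf), (g_text, g_conf) in product(speech, gesture)
        if (s_text, g_text) in consistent
    ]
    return sorted(candidates, key=lambda c: c[2], reverse=True)

for s, g, score in rank_interpretations(speech_hypotheses, gesture_hypotheses):
    print(f"{s} + {g}: {score:.2f}")
</syntaxhighlight>

The best joint reading (“open file” with “point:document”, scoring 0.42) wins even though neither recognizer was certain on its own, which is the practical benefit of combining evidence across channels.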